
Text-To-Text Transfer Transformer


Keyword Extraction from Short Texts with a Text-To-Text Transfer Transformer

Pęzik, Piotr, Mikołajczyk-Bareła, Agnieszka, Wawrzyński, Adam, Nitoń, Bartłomiej, Ogrodniczuk, Maciej

arXiv.org Artificial Intelligence

The paper explores the relevance of the Text-To-Text Transfer Transformer language model (T5) for Polish (plT5) to the task of intrinsic and extrinsic keyword extraction from short text passages. The evaluation is carried out on the new Polish Open Science Metadata Corpus (POSMAC), which is released with this paper: a collection of 216,214 abstracts of scientific publications compiled in the CURLICAT project. We compare the results obtained by four different methods, i.e., plT5kw, extremeText, TermoPL, and KeyBERT, and conclude that the plT5kw model yields particularly promising results for both frequent and sparsely represented keywords. Furthermore, a plT5kw keyword generation model trained on the POSMAC also seems to produce highly useful results in cross-domain text labelling scenarios. We discuss the performance of the model on news stories and phone-based dialog transcripts, which represent text genres and domains extrinsic to the dataset of scientific abstracts. Finally, we also attempt to characterize the challenges of evaluating a text-to-text model on both intrinsic and extrinsic keyword extraction.
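In a text-to-text formulation of keyword extraction like the one described above, the model receives a passage as input and emits its keywords as a single generated string. The sketch below is illustrative only, not the authors' code: it assumes the model returns keywords as a comma- or semicolon-separated string, and shows how such output might be post-processed into a clean keyword list.

```python
# Hypothetical post-processing for a generative keyword model (not the
# plT5kw implementation): assume the model emits one string of keywords
# separated by commas or semicolons, possibly with duplicates and
# inconsistent casing/whitespace.
import re

def parse_generated_keywords(generated: str) -> list[str]:
    """Split a generated keyword string into a deduplicated, ordered list."""
    seen = set()
    keywords = []
    for raw in re.split(r"[,;]", generated):
        kw = raw.strip().lower()
        if kw and kw not in seen:
            seen.add(kw)
            keywords.append(kw)
    return keywords

# Example: duplicates and stray whitespace are removed, order is kept.
print(parse_generated_keywords("keyword extraction; T5 ; Polish, t5"))
```

Deduplicating after lowercasing is one simple normalization choice; an intrinsic evaluation against gold keywords would typically also need lemmatization, which is omitted here.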


How To Paraphrase Text Using Python - AI Summary

#artificialintelligence

As writers, we often seek out tools to help us become more efficient or productive. Tools such as Grammarly can help with language editing. Text generation tools can help to rapidly generate original content from just a few keyword ideas given to the AI. Perhaps this could help end writer's block? That is a debatable question best saved for another time.


T5: Text-To-Text Transfer Transformer

#artificialintelligence

With the rise of Transfer Learning, Deep Learning has achieved many wonders. In this article, we'll discuss Google's state-of-the-art model, T5 (Text-to-Text Transfer Transformer), which was proposed earlier this year in the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". The paper is essentially a survey of modern transfer learning techniques used in language understanding, and it proposes a unified framework that casts all language problems into a text-to-text format. We will discuss this approach in greater detail in the coming sections. Moreover, the authors have also open-sourced a new dataset (built to facilitate their work) called C4, the Colossal Clean Crawled Corpus.
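The unified text-to-text format mentioned above works by prepending a short task prefix to the input, so that translation, summarization, and classification all become string-to-string problems. A minimal sketch, using illustrative prefixes in the style described in the T5 paper (the helper function itself is hypothetical):

```python
# Sketch of T5's text-to-text framing: every task is reduced to mapping
# an input string (with a task prefix) to an output string. The function
# below is illustrative, not part of any T5 library.

def to_text_to_text(task: str, text: str) -> str:
    """Prepend a task prefix so each NLP problem becomes string -> string."""
    prefixes = {
        "translate_en_de": "translate English to German: ",
        "summarize": "summarize: ",
        "cola": "cola sentence: ",  # grammatical acceptability as text output
    }
    return prefixes[task] + text

# All three tasks now share one input/output interface:
print(to_text_to_text("summarize", "T5 casts every NLP problem as text generation."))
print(to_text_to_text("cola", "The course is jumping well."))
```

Even classification labels are emitted as text (e.g. the literal strings "acceptable" or "not acceptable"), which is what lets a single model and loss function cover all of these tasks.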